Topics in Antiviral Medicine ; 30(1 SUPPL):75-76, 2022.
Article in English | EMBASE | ID: covidwho-1880033

ABSTRACT

Background: SARS-CoV-2 infection has resulted in over 219 million confirmed cases of COVID-19 and 4.5 million fatalities, highlighting the importance of elucidating the mechanisms of severe disease. Here we used machine learning (ML) to identify DNA methylation footprints of COVID-19 disease in publicly available data.

Methods: Genome-wide DNA methylation data from whole blood of SARS-CoV-2-infected and uninfected patients, generated on the Illumina HumanMethylationEPIC microarray platform, were publicly available through the NCBI Gene Expression Omnibus. We obtained a training cohort (GSE167202) of 460 individuals (164 COVID-19-infected and 296 non-infected) and an external validation dataset (GSE174818) of 128 individuals (102 COVID-19-infected and 26 non-COVID with a pneumonia diagnosis). The COVID-19 severity score (SS) was classified as follows: 0, uninfected; 1, released from department to home; 2, admitted to in-patient care; 3, progressed to ICU; and 4, death. Participants were then dichotomized as SS = 0 or SS ≥ 3. Raw data were processed using ChAMP in R 4.1.1, yielding over 850,000 methylation sites per sample for analysis. Beta values were logit transformed to M values using CpGTools in Python 3.8.8. The JADBio AutoML platform was used to analyze these datasets with the goal of identifying a methylation signature indicative of COVID-19 disease.

Results: In the training cohort, JADBio used LASSO feature selection (penalty = 1.5) to identify 4 unique methylation sites capable of carrying the predictive weight of a classification random forest trained on 100 trees with the Deviance splitting criterion (minimum leaf size = 3). The average area under the receiver operating characteristic curve (AUC-ROC) of the model was 0.933 (95% confidence interval [0.885, 0.970]), and the average area under the precision-recall curve (AUC-PRC) was 0.965 [0.932, 0.986]. When mild COVID-19 infections (SS = 1 or 2) were returned to the training dataset as an internal control, the model retained its predictive power (AUC-ROC = 0.985, AUC-PRC = 0.992). When applied to the external validation dataset, the model produced an AUC-ROC of 0.901 and an AUC-PRC of 0.748.

Conclusion: Using the JADBio AutoML platform, we developed a random forest classification model capable of accurately predicting COVID-19 infection. These results enhance our understanding of the epigenetic mechanisms used by SARS-CoV-2 in disease pathogenesis and identify potential therapeutic targets.
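
The two preprocessing steps named in the Methods, the logit transformation of beta values to M values and the dichotomization of the severity score, can be illustrated with a minimal sketch. This is not the CpGTools or JADBio implementation. It assumes the conventional base-2 definition M = log2(beta / (1 - beta)); the function names and the clipping constant `eps` are hypothetical and were chosen here only to avoid division by zero at beta = 0 or 1.

```python
import math


def beta_to_m(beta, eps=1e-6):
    """Convert a methylation beta value (0..1) to an M value.

    Uses the conventional definition M = log2(beta / (1 - beta)).
    The eps clipping is an assumption of this sketch, not a detail
    taken from the abstract.
    """
    clipped = min(max(beta, eps), 1.0 - eps)
    return math.log2(clipped / (1.0 - clipped))


def dichotomize(severity_score):
    """Map a severity score (0-4) to a binary label.

    Returns 0 for uninfected (SS = 0), 1 for severe disease (SS >= 3),
    and None for mild infections (SS = 1 or 2), which fall outside
    the dichotomized groups.
    """
    if severity_score == 0:
        return 0
    if severity_score >= 3:
        return 1
    return None


if __name__ == "__main__":
    # Illustrative values only; these are not data from the study.
    for beta in (0.2, 0.5, 0.8):
        print(f"beta={beta:.1f} -> M={beta_to_m(beta):.3f}")
    print([dichotomize(ss) for ss in range(5)])
```

A beta value of 0.5 maps to an M value of 0, and values above or below 0.5 map to positive or negative M values respectively. The transform stretches the extremes of the 0 to 1 range, which is why M values are often preferred to beta values for statistical modeling.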
